Abstract. - We characterize galaxy correlations in the Sloan Digital Sky Survey by measuring
several moments of galaxy counts in spheres. We first find that the average counts grow as
a power-law function of the distance with an exponent D = 2.1 ± 0.05 for r ∈ [0.5, 20] Mpc/h
and D = 2.8 ± 0.05 for r ∈ [30, 150] Mpc/h. In order to estimate the systematic errors in these
measurements we consider the variance of the counts, finding that it shows systematic
finite-size effects that depend on the sample sizes. We clarify, by making specific tests, that
these are due to galaxy long-range correlations extending up to the largest scales of the sample.
The analysis of mock galaxy catalogs, generated from cosmological N-body simulations of the
standard LCDM model, shows that for r < 20 Mpc/h the counts exponent is D ≈ 2.0, weakly
dependent on galaxy luminosity, while D = 3 at larger scales. In addition, contrary to the case
of the observed galaxy samples, no systematic finite-size effects in the counts variance are found
at large scales, a result that agrees with the absence of large-scale (r ≈ 100 Mpc/h) correlations
in the mock catalogs. We thus conclude that the observed galaxy distribution is characterized
by correlations, fluctuations and hence structures that are larger, both in amplitude and in
spatial extension, than those predicted by the standard LCDM model of galaxy formation.
Introduction. - The quantification of large-scale galaxy correlations and fluctuations is a
central problem in cosmology. The advent of massive redshift surveys has made possible
precise measurements of the galaxy two-point correlation function on scales of about ten Mpc,
where power-law correlations (in redshift space) have been well established [THS]. On larger
scales the situation is less clear, as large density fluctuations on 150 Mpc/h scales were
observed [4l-[TT|: these are not obviously compatible with the absence of strong clustering on
those scales predicted by standard models of galaxy formation [T11[T3]. In addition, in a deep
sample of very bright galaxies an unexpectedly strong signal with respect to the predictions of
the standard model was observed on 200 Mpc/h scales [HHin]; recently an excess of clustering
was also found by [T7l[T8].

In order to characterize the large-scale galaxy correlations, the average density n(r) was
measured in the Sloan Digital Sky Survey (SDSS) [19,20]: it was shown in [illT] to present a
power-law scaling behavior for r < 20 Mpc/h. However, while some authors [21] noticed a
transition toward uniformity (i.e. n(r) ∼ const.) around 70 Mpc/h, others concluded that
n(r) still decays as a power law of r for 20 Mpc/h < r < 80 Mpc/h [H]. Furthermore, this
latter behavior was shown to be associated with a scaling of the variance slower than for a
uniform distribution and with a probability density function (PDF) of galaxy counts in spheres
well fitted by a Gumbel distribution (i.e., different from a simple Gaussian PDF) j^ . It was
then found that the PDFs in different spatial volumes present systematic differences at large
enough scales which, by making specific tests, were interpreted as due to large-scale
structures [23,24]. However, those tests used non-overlapping sub-samples covering different
redshift ranges, so that a physical or observational scale-dependent systematic effect could,
at least partially, affect the observed behaviors.

For instance, it was noticed that the number density of bright galaxies increases by a factor
≈ 3 as the redshift increases from z = 0 to z = 0.3 [25], and to explain these observations a
significant evolution in the luminosity and/or number density of galaxies at redshifts z < 0.3
was proposed. In this context evolution can be parameterized as a redshift-dependent
correction [55]. However, by performing several tests to determine the possible effect of
evolution on the SDSS data, it was concluded [21] that, while evolution may change the
behavior of galaxy counts as a function of redshift, it is unlikely to produce fluctuations of
large amplitude and large spatial extension.

In this letter, by considering SDSS equal-volume samples covering the same distance scales,
we are able to disentangle finite-size effects due to large-scale correlations from
redshift-dependent systematic effects. This allows us to confirm the results by [^[^ , to
extend them to larger scales, i.e. r ≈ 150 Mpc/h, and to clarify several other properties of
the large-scale galaxy distribution.

The Data. - We have constructed several sub-samples of the main-galaxy (MG) sample of the
spectroscopic catalog SDSS-DR7 [27,28]. The selection criteria used to construct the
volume-limited (VL) samples are the following (see [53] for more details): we selected galaxies
from the MG sample with redshift z ∈ [10~^, 0.3], redshift confidence z_conf > 0.35, no
significant redshift determination errors and apparent magnitude m_r < 17.77. In order to
avoid the irregular edges of the survey boundaries, and to consider a simple sky area, we have
constructed a sample limited, in the internal angular coordinates of the survey [23], by
η ∈ [-33.5°, 36.0°] and λ ∈ [-48°, 51.5°]. In what follows we also consider two half samples of
equal volume, limited by (R2) λ ∈ [-48°, 0°] and (R3) λ ∈ [0°, 51.5°].

The survey is known to have a small angular incompleteness: on average, about 5% of the
target galaxies do not have a measured redshift [55]. In general, there is no procedure free of
a-priori assumptions to correct for such an incompleteness. Indeed, given that the detailed
information on the real galaxy distribution is unknown, one has to make some assumptions on
the statistical properties of such a distribution (see e.g. [T]). As we do not want to use such
ad-hoc assumptions, because we employ statistical methods that aim to test some of the most
common assumptions in the analysis of galaxy correlations [30], we have adopted the following
strategy. On the one hand, we have considered an angular region that does not include the
survey edges, where completeness varies most. On the other hand, we have performed a few
tests to control the effect of completeness on the correlation analysis and, in particular, as
we discuss below, we have focused on testing the stability of the results. Note that the
incompleteness due to fiber collisions can be neglected in measurements of large-scale (i.e.,
r > 10 Mpc/h) galaxy correlations, as this effect is only relevant on very small scales (see [23]
and references therein).

^1 The prescription used to define a volume limited sample takes into account the luminosity
selection effects that necessarily affect a survey, i.e. that intrinsically faint galaxies are
observed only if they are located close to the observer, while intrinsically bright galaxies are
observed both nearby and far away from the observer.

Sample   Rmin   Rmax    Mmin    Mmax       Np
VL1        50    200   -18.9   -21.1    73811
VL2       125    400   -20.5   -22.2   129975
VL3       200    600   -21.6   -22.8    51698

Table 1: Main properties of the SDSS VL samples: Rmin, Rmax (in Mpc/h) are the chosen limits
for the metric distance; Mmin, Mmax define the interval for the absolute magnitude in each
sample and Np is the number of galaxies in the sample.

In order to construct VL samples (see Tab. 1) we computed the metric distances R using the
standard cosmological parameters, i.e., Ω_M = 0.3 and Ω_Λ = 0.7. Absolute magnitudes M are
computed using Petrosian apparent magnitudes in the r filter corrected for Galactic
absorption and applying standard K-corrections [29].
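The construction of a VL sample described above can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline used for the actual catalog: the function names are hypothetical, the universe is taken to be flat with the parameters quoted in the text, distances are expressed in Mpc/h by setting H0 = 100 km/s/Mpc, and the K-correction is left as a user-supplied number (`k_corr`) because its functional form is not given here.

```python
import math

C_KM_S = 299792.458   # speed of light in km/s
H0 = 100.0            # km/s/Mpc, so that distances come out in Mpc/h

def metric_distance(z, omega_m=0.3, omega_l=0.7, n_steps=1000):
    """Comoving (metric) distance R(z) in Mpc/h for a flat universe,
    integrated with the trapezoidal rule."""
    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + omega_l)
    dz = z / n_steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    total += sum(inv_e(k * dz) for k in range(1, n_steps))
    return C_KM_S / H0 * total * dz

def absolute_magnitude(m_app, z, k_corr=0.0):
    """M = m - 5 log10(d_L) - 25 - K, with d_L = (1 + z) R in Mpc/h.
    k_corr is a placeholder for the K-correction (assumption)."""
    d_lum = (1.0 + z) * metric_distance(z)
    return m_app - 5.0 * math.log10(d_lum) - 25.0 - k_corr

def in_vl_sample(r, m_abs, r_min, r_max, m_faint, m_bright):
    """True if a galaxy lies inside the distance and magnitude limits."""
    return r_min <= r <= r_max and m_bright <= m_abs <= m_faint

# Limits of VL1 from Table 1: R in [50, 200] Mpc/h, M in [-21.1, -18.9].
VL1 = dict(r_min=50.0, r_max=200.0, m_faint=-18.9, m_bright=-21.1)
```

For example, `in_vl_sample(metric_distance(z), absolute_magnitude(m, z), **VL1)` decides whether a galaxy with apparent magnitude `m` and redshift `z` belongs to VL1. Cutting in both distance and absolute magnitude is what removes the luminosity selection effect discussed in the footnote.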

Statistical methods. - We consider the number of galaxies N_i(r) within radius r around the
i-th galaxy: this quantity differs for each galaxy and hence we treat it as a random variable
and characterize its statistical properties. The average over an ensemble of realizations,
⟨N(r)⟩, can be estimated by the volume average (assuming ergodicity; see footnote 2)

\[
\overline{N(r)} = \frac{1}{M(r)} \sum_{i=1}^{M(r)} N_i(r) . \qquad (1)
\]



Note that the number of points contributing to the average (Eq. 1), M(r), depends on the
scale r as an effect of the requirement that the spheres must be fully included within the
sample's boundaries [23]. To estimate the typical fluctuations of the random variable N_i(r)
we may consider the conditional variance Σ²(r) = ⟨N²(r)⟩ − ⟨N(r)⟩². In general [30], this can
be written as the sum of two contributions, Σ²(r) = Σ_i²(r) + Σ_p²(r), where Σ_i²(r) is the
intrinsic part, due to correlations, and Σ_p²(r) ∼ ⟨N(r)⟩ is due to Poisson noise. The
normalized variance is defined as σ²(r) = Σ²(r)/⟨N(r)⟩² = σ_i²(r) + σ_p²(r), where its
intrinsic part can be estimated by

\[
\sigma_i^2(r) = \frac{1}{\overline{N(r)}^{\,2}} \, \frac{1}{M(r)} \sum_{i=1}^{M(r)}
\left[ N_i(r) - \overline{N(r)} \right]^2 - \frac{1}{\overline{N(r)}} . \qquad (2)
\]
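The estimators of Eqs. (1) and (2) can be illustrated with a short sketch. It rests on assumptions that are not part of the analysis above: the sample is a cubic box of side `box` rather than the actual survey volume, the centre galaxy is not counted in its own sphere, the Poisson term is taken to be exactly 1/N̄, and the function names are hypothetical. The brute-force pair loop is only meant for small point sets.

```python
import math

def counts_in_spheres(points, r, box):
    """N_i(r) for every point whose sphere of radius r lies fully inside
    the cubic sample [0, box]^3 (the centre itself is not counted)."""
    counts = []
    for i, p in enumerate(points):
        if any(c < r or c > box - r for c in p):
            continue  # the sphere would cross the sample boundary
        n = sum(1 for j, q in enumerate(points)
                if j != i and math.dist(p, q) <= r)
        counts.append(n)
    return counts

def conditional_statistics(counts):
    """Return (mean counts, total normalized variance, intrinsic part).

    The intrinsic part subtracts the Poisson term 1/mean from the
    total normalized variance."""
    m = len(counts)                      # this is M(r)
    mean = sum(counts) / m               # volume average of N_i(r)
    var = sum((n - mean) ** 2 for n in counts) / m
    sigma2 = var / mean ** 2
    return mean, sigma2, sigma2 - 1.0 / mean
```

The length of the list returned by `counts_in_spheres` is M(r): it shrinks as r grows because fewer centres satisfy the boundary condition, which is the reason why the largest scales are sampled by few, strongly overlapping spheres.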



Results. - We have measured the conditional average density, n(r) = N(r)/V(r) where
V(r) = 4πr³/3, in the different SDSS VL samples (see Fig. 1). We find that: (i) at small
scales, r ∈ [0.5, 20] Mpc/h, n(r) ∝ r^(-γ) with γ = 0.88 ± 0.05 (i.e. N(r) ∝ r^D, with
D = 3 − γ = 2.12 ± 0.05). The error on the exponent has been derived by measuring the
scatter among the different samples.
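As an illustration of how an exponent γ and the corresponding D = 3 − γ can be extracted from a measured n(r), the sketch below performs an ordinary least-squares fit in log-log space. The fitting procedure actually used for the measurements is not specified in the text, so this is an assumption, as are the function name and the synthetic input.

```python
import math

def fit_power_law(r_values, n_values):
    """Least-squares fit of log n = log A - gamma * log r.
    Returns (gamma, A)."""
    xs = [math.log(r) for r in r_values]
    ys = [math.log(n) for n in n_values]
    k = len(xs)
    x_mean = sum(xs) / k
    y_mean = sum(ys) / k
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope, math.exp(y_mean - slope * x_mean)

# Synthetic check: n(r) = 2 r^(-0.88), log-spaced radii in [0.5, 20] Mpc/h
radii = [0.5 * 40.0 ** (k / 19) for k in range(20)]
density = [2.0 * r ** -0.88 for r in radii]
gamma, amplitude = fit_power_law(radii, density)
dimension = 3.0 - gamma   # exponent D of N(r), since N(r) = n(r) V(r)
```

Repeating the fit in each VL sample and taking the spread of the resulting exponents gives an error estimate in the spirit of the scatter-based one quoted above.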

(ii) At larger scales, i.e. r > 30 Mpc/h and up to ≈ 150 Mpc/h in the deepest sample, the
exponent is γ = 0.2 ± 0.05 (i.e. D = 2.8 ± 0.05). The amplitude of the conditional average
density in the different VL samples is in principle fixed by a luminosity factor that depends
on the absolute luminosity of the galaxies therein included. However, in a finite sample, as
long as correlations extend on scales of the order of the sample size, the amplitude of n(r)
also depends on the particular structures present in that specific volume. Therefore the
normalization of the galaxy counts in samples of different sizes, covering different space
volumes and including galaxies of different luminosity, depends on systematic effects which
become negligible only for very large sample sizes. For this reason in Fig. 1, by choosing an
arbitrary normalization, we have plotted n(r)/n(r*) where r* = 30 Mpc/h. On the other hand,
note that the exponent of n(r) does not show variations larger than the estimated error bar
(see below) in the different samples.

Fig. 1: Galaxy average density normalized to its value at 30 Mpc/h, for the different SDSS
samples. The exponent γ reported in the labels corresponds to the best fit in the range
[0.5, 20] Mpc/h.

^2 The symbol ⟨...⟩ (an overline) stands for the ensemble (volume) average performed with the
condition that the sphere center coincides with a point of the distribution, i.e. it is a
conditional average [30].

The main result of this analysis is that the slopes of the galaxy average density, both at
small and large scales, show a very good agreement in the different samples. When the radius
r approaches the size of the largest sphere included in each sample there are systematic
effects, as shown by the fact that the large-scale tail of n(r) (where M(r) in Eq. 1 rapidly
becomes very small) grows at different scales in the different samples.

In order to quantify the fluctuations affecting the large-scale (i.e. r > 10 Mpc/h) behavior
of n(r) it is necessary to estimate statistical errors. For instance, one may simply compute
them as √(Σ²(r) V⁻²(r) M⁻¹(r)): however, in this way one underestimates the true errors,
because large-scale correlations can break the Central Limit Theorem and thus the different
determinations are not independent [50] (note that such error bars would not even be visible
in the plot in Fig. 1). A possible evaluation of the scatter of n(r) can be performed by
measuring sample-to-sample variations [15]. To this aim, we calculate Eq. 2 in the different
samples (see Fig. 2), finding that at small scales the results are very similar while at large
scales a clear difference is detected.

Interestingly, the differences in σ_i²(r) occur at a scale that grows with the sample size.
The determination in the smallest sample, i.e. VL1, shows a marked large-scale difference
with respect to those in VL2 and VL3, which instead present a more similar behavior
everywhere but in the very large scale tail (i.e. r ≈ 100 Mpc/h). Therefore a possible
explanation of this large-scale behavior is that it is due to a finite-size effect, i.e. that
the scale at which the abrupt decay of σ_i²(r) occurs in the different samples depends on
their sizes. To show that this is the case we have considered the behavior of σ_i²(r) in
sub-volumes of the VL2 and VL3 samples limited to a depth (i.e., Rmax) smaller than the ones
reported in Tab. 1. In this way we are able to clearly single out the finite-size effect (see
Fig. 3).

Fig. 2: Normalized conditional variance (Eq. 2) in the different samples: circles represent
the intrinsic variance σ_i²(r), while the solid lines represent the total variance σ²(r) (see
Eq. 2). The dashed line has the slope expected for pure Poisson noise, a behavior that
describes only the very small scales.

We performed another test by cutting the sample into two subsamples of equal volume (i.e.,
limited in angles by the regions R2 and R3) and determining n(r) and σ_i²(r) in each of them
(see Fig. 4). In the case of VL3 the determinations of both the average density and the
variance in the two subsamples agree better than for VL1: this corroborates the result,
obtained also with the previous tests, that systematic finite-size effects become weaker in
samples with larger volumes [^[M1[5T]. This is in agreement with the larger dimension
observed at large scales, which also implies less wild fluctuations [30].

From these behaviors we may conclude that both n(r) and σ²(r) converge to a well-defined
value only for scales r < r_s smaller than the radius of the largest sphere included in the
sample. We estimate that r_s ≈ 50 Mpc/h for VL1, r_s ≈ 100 Mpc/h for VL2 and r_s ≈ 150
Mpc/h for VL3. Beyond these scales the behaviors in the two half samples systematically
differ: this is in agreement with the results by [Hlinil^llllISI], where the whole behavior
of the PDF of N_i(r) was considered.
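One possible way to turn the half-sample comparison into a number is sketched below. The text does not state how r_s was estimated, so the criterion used here (the last scale, scanning outward, at which the two determinations still agree within a relative tolerance) is an assumption made for illustration only, as are the function name and the default tolerance.

```python
def stability_scale(radii, values_a, values_b, tolerance=0.2):
    """Largest radius up to which two determinations agree.

    Scans the scales in increasing order and stops at the first one where
    the difference between the two half-sample measurements exceeds
    `tolerance` times their mean absolute value. Returns None if even
    the smallest scale fails the test."""
    r_s = None
    for r, a, b in zip(radii, values_a, values_b):
        if abs(a - b) > tolerance * 0.5 * (abs(a) + abs(b)):
            break
        r_s = r
    return r_s
```

Here `values_a` and `values_b` would be, for instance, the values of n(r) or of the normalized variance measured in the regions R2 and R3 at the radii listed in `radii`.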

Fig. 3: Normalized conditional variance in subsamples of VL2 (upper panel), with
Rmax = 250, 300, 400 Mpc/h, and of VL3 (bottom panel), with Rmax = 300, 400, 500, 600 Mpc/h.

As mentioned above, it is interesting to consider the effect of the small (i.e. ≈ 5%) angular
incompleteness on the measurements of galaxy correlations. For instance, there are some
small sky regions where, due to the presence of bright stars, galaxies have not been observed.
Each bright star corresponds to a circular hole where galaxies cannot be observed. The radius
of the circle around each bright star depends on the apparent magnitude of the star [29]. An
upper limit to this radius is 3 arcsec, corresponding to very bright stars [HH]. A simple way
to test for the effect of such holes is the following.

We randomly distribute holes of this maximal size with a surface density of 50 per square
degree (roughly corresponding to the surface density of bright stars [32]). Galaxies that fall
inside one of the holes are cut from the resulting sample. In this way we have artificially
removed from the sample, in a correlated way at small angles, about 10% of the galaxies (i.e.
about twice the angular incompleteness of the catalog). The question is whether the
large-scale (i.e., r > 1 Mpc/h) correlation properties are affected by such an incompleteness.
Results are shown in Fig. 5: one may note that, apart from an obvious 10% shift in the
amplitude of the average conditional density, no scale-dependent changes are manifest. Thus
we can confidently conclude that incompleteness effects do not play a major role in the
results of the correlation analysis. We refer to a forthcoming work for a more complete
series of tests on the angular incompleteness of this survey.

Fig. 4: Normalized conditional variance in the three regions of VL1 (upper panel) and VL3
(bottom panel).
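The hole test described above can be sketched as follows. The code is a simplified illustration under assumptions of ours: positions are given as generic (longitude, latitude) pairs in degrees rather than in the survey coordinates, hole centres are drawn uniformly on the sphere inside a rectangular patch, and all function names are hypothetical.

```python
import math
import random

def angular_separation(lon1, lat1, lon2, lat2):
    """Great-circle separation in degrees (haversine formula)."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def random_sky_points(n, lon_range, lat_range, rng):
    """n points uniform on the sphere inside a lon/lat rectangle (degrees)."""
    s_lo = math.sin(math.radians(lat_range[0]))
    s_hi = math.sin(math.radians(lat_range[1]))
    return [(rng.uniform(*lon_range),
             math.degrees(math.asin(rng.uniform(s_lo, s_hi))))
            for _ in range(n)]

def cut_holes(galaxies, holes, radius_deg):
    """Drop every galaxy lying within radius_deg of any hole centre."""
    return [g for g in galaxies
            if all(angular_separation(g[0], g[1], h[0], h[1]) > radius_deg
                   for h in holes)]
```

The number of holes to draw is the chosen surface density times the area of the patch in square degrees. After the cut, the correlation analysis is simply repeated on the reduced sample and compared scale by scale with the original one.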

Mock galaxy catalogs. - We have repeated the same analysis on some mock galaxy samples.
These are constructed from cosmological N-body simulations of the standard LCDM model by
applying the same cuts in absolute magnitude and distance as those reported in Tab. 1 and by
computing the redshift-space positions (i.e., simply applying the corrections to the redshift
due to the peculiar velocities along the line of sight). The conditional average density (see
Fig. 6) shows (i) a slope γ ≈ 1 for r ∈ [0.5, 20] Mpc/h that is weakly dependent on the
average luminosity of the galaxies, and (ii) a well-defined crossover to uniformity, i.e.
γ = 0, at ≈ 30 Mpc/h. Both features differ from those of the real SDSS samples.
Corresponding to the crossover toward uniformity (absence of strong large-scale
correlations), the conditional variance (Fig. 7) does not show a finite-size dependence,
contrary to what occurs for the real samples (Fig. 2). In this case, the stability of the
normalized variance in the different samples is shown by dividing the samples into two parts
and comparing the behaviors: because of the lack of large-scale correlations, no systematic
differences are found between the two subsamples.
Fig. 5: Conditional average density for the full sample VL2 and for a modified version of it
(VL21i) where ≈ 10% of the galaxies have been cut in a correlated way (see text): apart from
an obvious 10% shift in the amplitude of the average conditional density, no scale-dependent
changes are manifest. (Horizontal axis: r in Mpc/h.)

Fig. 6: Conditional average density, normalized to its value at 30 Mpc/h, for the different
samples of the mock galaxy catalog. The exponent γ reported in the labels is the best fit in
the range [0.5, 20] Mpc/h.

Fig. 7: Normalized conditional variance for the different samples of the mock galaxy catalog.
No finite-size effects are present in this case.



Note that a given mock galaxy sample is constructed from the underlying distribution of dark
matter particles by assuming a certain prescription to identify galaxies. Such a prescription
is generally based on a physical mechanism that links the local density of dark matter
particles to the probability of forming a mock galaxy [34]. In principle, one can introduce
different physically motivated prescriptions to relate the dark and visible matter
distributions in cosmological simulations. The question is then whether a different
prescription can substantially change the correlation properties of the resulting mock
galaxies, possibly giving rise to a better agreement with observations. We do not explore
this question here and we cannot exclude that a better agreement could be obtained with a
different bias prescription. However, we note that, given that large-scale correlations are
not present in the underlying dark matter distribution, as this is what theoretical models
predict on scales of the order of ∼ 100 Mpc/h [T51I55], the only way to introduce such
correlations is via a bias mechanism: biasing should correspond to a large-scale correlated
selection. This implies that the biasing mechanism must be non-local, i.e. completely
different from what is generally considered to be a physically plausible prescription.

Discussion. - In summary, we have analyzed the scaling properties of the galaxy distribution
in the SDSS-DR7 samples. We found that at small scales, r ∈ [0.5, 20] Mpc/h, the galaxy
(conditional) average density decays as a power-law function of distance, n(r) ∝ r^(-γ), with
an exponent γ ≈ 0.9, while at large scales, i.e. r ∈ [30, 150] Mpc/h, it decays more slowly:
γ ≈ 0.2. The analysis of the variance (Eq. 2) allows us to clearly single out a finite-size
effect, which can be simply understood as due to galaxy correlations extending up to the
largest scales of the considered samples. These behaviors differ from those found in mock
galaxy samples where: (i) at small scales the exponent, γ ≈ 1, slightly depends on galaxy
luminosity, and (ii) at large scales, i.e. r > 30 Mpc/h, a well-defined crossover to
uniformity (i.e. γ = 0) is found. In addition, no finite-size effects are detected in the
large-scale behavior of the normalized counts variance.

The detection of very large scale galaxy correlations and of finite-size effects allows us to
conclude that the observed galaxy distribution is more correlated, fluctuating and thus
clustered than the one predicted by the standard LCDM model of galaxy formation through
cosmological N-body simulations [33,34]. These findings are in agreement with the results
obtained in the same samples, although at smaller scales r < 100 Mpc/h, through different
tests [iniimilSI], and with the results obtained in the 2 degree field redshift survey [IHS]
and other surveys P^HTB]. Note that the results obtained by [21] are also compatible with a
slow decay of n(r) at large scales.

Because of the large-scale scaling of the galaxy average density we conclude that any
statistical quantity normalized to the estimate of the sample average density (e.g., the
standard two-point correlation function) is biased by finite-size effects [50]. This implies
that the volume of current galaxy samples is not yet large enough to measure the standard
two-point correlation function at ∼ 100 Mpc/h.
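A short worked example may help to see why the normalization to the sample density introduces a finite-size bias. It assumes, purely for illustration, that the power law n(r) = B r^(-γ) holds all the way up to the radius R_s of the sample; the symbols B, R_s, n_S and ξ_E are notation introduced here and do not appear in the analysis above.

```latex
% n(r) = N(r)/V(r) = B r^{-\gamma}  implies  N(r) = (4\pi/3) B r^{3-\gamma}
\begin{align}
n_S &\equiv \frac{N(R_s)}{V(R_s)} = B\,R_s^{-\gamma}, \\
\frac{1}{4\pi r^2}\frac{dN(r)}{dr} &= \frac{3-\gamma}{3}\,B\,r^{-\gamma}, \\
\xi_E(r) &\equiv \frac{1}{n_S}\,\frac{1}{4\pi r^2}\frac{dN(r)}{dr} - 1
          = \frac{3-\gamma}{3}\left(\frac{r}{R_s}\right)^{-\gamma} - 1 .
\end{align}
```

Under this assumption the estimated correlation function ξ_E(r) depends on r only through the ratio r/R_s, so its amplitude and the scale at which it crosses any fixed value change with the sample size rather than reflecting an intrinsic property of the distribution.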

From a theoretical point of view, the main challenge posed by our results concerns the way in
which the large-scale universe is modeled. Indeed, the deviation from a purely statistically
homogeneous and isotropic, and spatially uniform, density field [31] implies that one should
consider the effects of inhomogeneities on the dynamics of the universe and the deviations
that they can possibly introduce with respect to the simple FRW models.